Polymorphic type framework for scientific workflows with relational data model
Authors
Abstract
Scientific workflow systems provide languages for representing complex scientific processes as decompositions into lower-level tasks, down to the level of atomic, executable units. To support data analysis activities, a wide variety of such languages represent data transformation and processing operations as task nodes within a workflow. Adding data type information to the task inputs and outputs allows workflow authors to perform type checking at design time, search for compatible nodes in public component repositories, and define specifications of abstract workflows. Introducing support for strict data typing simplifies the implementation of a workflow system in addressing these issues, but at the expense of flexibility. We address this challenge by developing a data typing framework for scientific workflow systems that supports polymorphic data types, which specify the minimal type constraints on node inputs and outputs. We focus on applications that use a relational data model and provide a polymorphic type formula composition algorithm for workflow nodes and fragments. The techniques introduced are validated by applying the inference engine prototype to an adverse drug reaction study performed with the relational algebra subset of the Discovery Net workflow system.
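As a rough illustration of what "minimal type constraints" and "type formula composition" can mean for relational data, the sketch below models a node signature as the columns it requires, adds, and drops, and derives the signature of two nodes run in sequence. This is not the paper's algorithm or the Discovery Net API: the `NodeType` class, the `compose` function, and all column names are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeType:
    """Hypothetical polymorphic signature of a workflow node over table schemas."""
    requires: dict                              # minimal input columns: name -> type
    adds: dict = field(default_factory=dict)    # columns appended to the output
    drops: frozenset = frozenset()              # columns removed from the input

    def apply(self, schema: dict) -> dict:
        """Check the minimal constraints on `schema` and return the output schema."""
        for name, typ in self.requires.items():
            if schema.get(name) != typ:
                raise TypeError(f"input must contain column {name!r} of type {typ}")
        # Columns the node does not mention pass through unchanged (polymorphism).
        out = {k: v for k, v in schema.items() if k not in self.drops}
        out.update(self.adds)
        return out


def compose(first: NodeType, second: NodeType) -> NodeType:
    """Derive the signature of the fragment `first` followed by `second`."""
    requires = dict(first.requires)
    for name, typ in second.requires.items():
        if name in first.adds:
            # The column is produced inside the fragment; only its type must agree.
            if first.adds[name] != typ:
                raise TypeError(f"column {name!r}: produced {first.adds[name]}, needed {typ}")
        elif name in first.drops:
            raise TypeError(f"column {name!r} is dropped before it is needed")
        elif requires.setdefault(name, typ) != typ:
            raise TypeError(f"column {name!r} is required with two different types")
    adds = {k: v for k, v in first.adds.items() if k not in second.drops}
    adds.update(second.adds)
    return NodeType(requires, adds, first.drops | second.drops)


# Hypothetical nodes: a row filter on age and a derived dose-per-weight column.
keep_adults = NodeType(requires={"age": "int"})
per_kg = NodeType(
    requires={"dose_mg": "float", "weight_kg": "float"},
    adds={"dose_per_kg": "float"},
    drops=frozenset({"weight_kg"}),
)
fragment = compose(keep_adults, per_kg)

table = {"patient_id": "str", "age": "int", "dose_mg": "float",
         "weight_kg": "float", "drug": "str"}
result = fragment.apply(table)
```

The composed signature accepts any table that has at least the three required columns, so extra columns such as `drug` flow through the fragment untouched. Under strict typing, by contrast, each node would have to name its full input schema.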
Similar resources
Actor-Oriented Design of Scientific Workflows
Scientific workflows are becoming increasingly important as a unifying mechanism for interlinking scientific data management, analysis, simulation, and visualization tasks. Scientific workflow systems are problem-solving environments, supporting scientists in the creation and execution of scientific workflows. While current systems permit the creation of executable workflows, conceptual modelin...
An Algebraic Approach for Data-Centric Scientific Workflows
Scientific workflows have emerged as a basic abstraction for structuring and executing scientific experiments in computational environments. In many situations, these workflows are computationally and data intensive, thus requiring execution in large-scale parallel computers. However, parallelization of scientific workflows remains low-level, ad-hoc and labor-intensive, which makes it hard to ex...
Hydrologists workbench: A governance model for scientific workflow environments
Scientific workflows (SWF) are an emerging approach that enables scientists to compose and execute complex, distributed scientific processes. The approach is premised on the ability to compose, publish, share and reuse workflows across distributed communities of collaborating scientists. Scientific workflow software (SWFS) provides a technical framework to compose, publish, and reuse SWFs toget...
A Relational Framework to Explain the Town’s Local Actors Decision-Making Mechanism
The life of Towns has become more important and greatly emphasized in recent years and this heralds the arrival of a new era when this type of settlements is introduced as major living and investment capacities. Therefore, it is necessary to study the different aspects of towns in order to plan and manage their development and answer the question about different decision-making mechanism in the...
Exploiting ontologies and higher order knowledge in relational data mining Doctoral Thesis
Present day knowledge discovery tasks require mining heterogeneous and structured data and knowledge sources. The key enabling factors for performing these tasks include efficient exploitation of knowledge about the domain of discovery and utilizing meta knowledge about the data mining process, which facilitates the construction of complex workflows consisting of highly specialized algorithms. ...
Journal title:
- IJBPIM
Volume 5, issue
Pages -
Publication date: 2010